
(ICCV 2017) Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models

Klokov R, Lempitsky V. Escape from cells: Deep kd-networks for the recognition of 3d point cloud models[C]//2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017: 863-872.



1. Overview


1.1. Motivation

  • rasterizing 3D models onto uniform voxel grids leads to a large memory footprint and slow processing
  • many indexing structures already exist (kd-trees, octrees, binary space partitioning trees, R-trees, constructive solid geometry)

This paper proposes Kd-Networks, which:

  • divide the point cloud recursively to construct a kd-tree
  • perform multiplicative transformations whose parameters are shared (mimicking ConvNets)
  • do not rely on uniform grids and thus avoid their poor scaling behavior


1.2. Related Work

  • 3D convolutions (+ GANs)
  • 2D convolutions on 2D projections of 3D objects
  • spectral convolutions
  • PointNet
  • RNNs
  • OctNet (octrees)
  • graph-based ConvNets

1.3. Datasets

  • classification. ModelNet10, ModelNet40
  • shape retrieval. SHREC’16
  • shape part segmentation. ShapeNet part dataset



2. Network




2.1. Input

The kd-tree is built by recursively splitting the point cloud into two equally-sized subsets, which yields N - 1 internal nodes, each with a split direction d_i along the x, y, or z axis (a construction sketch follows the notation list below).

  • N. the fixed point-cloud size (clouds are sub-sampled or over-sampled to exactly N points)
  • d_i. the split direction at node i
  • l_i. the tree level of node i
  • c_1(i) = 2i, c_2(i) = 2i + 1. the children of the i-th node
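
As a concrete illustration, here is a minimal sketch of this balanced construction, assuming the split axis is chosen by maximum extent and that N is a power of two (build_kdtree is a hypothetical helper, not the authors' code):

```python
import numpy as np

def build_kdtree(points):
    """Recursively split a point set into two equal halves.

    Returns a nested dict: internal nodes store the split axis d_i,
    leaves store a single point. Assumes len(points) is a power of two,
    matching the fixed cloud size N used by Kd-Net.
    """
    if len(points) == 1:
        return {"point": points[0]}
    # Split along the axis of maximum spread.
    axis = int(np.argmax(points.max(axis=0) - points.min(axis=0)))
    order = np.argsort(points[:, axis])
    half = len(points) // 2
    return {
        "axis": axis,  # d_i: 0 = x, 1 = y, 2 = z
        "left": build_kdtree(points[order[:half]]),
        "right": build_kdtree(points[order[half:]]),
    }

# Example: a toy cloud of N = 8 points in [-1, 1]^3.
tree = build_kdtree(np.random.uniform(-1.0, 1.0, size=(8, 3)))
```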

2.2. Processing Data with Kd-Net

Given a kd-tree, compute a representation v_i for every node, working bottom-up from the leaves to the root. Within each level, all nodes that share the same split direction share the parameters of one affine layer (a sketch of one such level follows the notation list):

v_i = φ(W_{l_i, d_i} [v_{c_1(i)}; v_{c_2(i)}] + b_{l_i, d_i})



  • v_i. the representation of the i-th node
  • φ. the ReLU nonlinearity
  • [·; ·]. concatenation
  • W, b. the parameters of the layer at level l_i for split direction d_i (W maps the concatenated 2m_{l+1}-dimensional children to m_l dimensions; b is m_l-dimensional)
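
A NumPy sketch of one bottom-up level, assuming parameters are stored per split direction for the current level (kdnet_level_up is a hypothetical helper):

```python
import numpy as np

def kdnet_level_up(children, axes, W, b):
    """Compute the parent representations of one tree level.

    children: (2K, m_child) representations at level l+1, ordered so that
              parent k's children sit at rows 2k and 2k+1.
    axes:     (K,) split direction d_i of each parent (0, 1, or 2).
    W, b:     per-direction parameters for this level; W[d] has shape
              (m_parent, 2 * m_child), so parameters are shared by all
              parents with the same split direction.
    """
    K = len(axes)
    parents = np.empty((K, W[0].shape[0]))
    for k in range(K):
        concat = np.concatenate([children[2 * k], children[2 * k + 1]])
        d = axes[k]
        parents[k] = np.maximum(W[d] @ concat + b[d], 0.0)  # affine + ReLU
    return parents
```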

2.3. Classification

The representation of the root node summarizes the whole shape and is fed to fully connected layers that output class scores.
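
A minimal sketch of such a head, assuming a single linear layer followed by a softmax (shapes are illustrative):

```python
import numpy as np

def classify(v_root, W_cls, b_cls):
    """Class probabilities from the root representation v_1."""
    logits = W_cls @ v_root + b_cls
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()
```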



2.4. Shape Retrieval

  • output a descriptor vector (remove the trained classifier from the classification network)
  • train with the histogram loss; Siamese or triplet losses can also be used (a triplet-loss sketch follows)
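
The paper trains with the histogram loss; as a simpler stand-in, here is a sketch of the triplet alternative mentioned above (the margin value is an assumption):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Margin-based triplet loss on descriptor vectors: pull the positive
    closer to the anchor than the negative by at least the margin."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```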

2.5. Segmentation



  • mimics an encoder-decoder (hourglass) architecture: a bottom-up pass followed by a top-down pass
  • skip connections from encoder levels to the matching decoder levels (a per-level sketch follows this list)
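
A per-level sketch of the decoder step under these two ideas; the parameter names (W, b for the parent-to-children map, S, c for the skip transform) are assumptions, not the authors' notation:

```python
import numpy as np

def kdnet_level_down(parents, skips, axes, W, b, S, c):
    """Expand decoder features one level down, fusing encoder skips.

    parents: (K, m) decoder features at level l.
    skips:   (2K, m_child) encoder features at level l+1 (skip connection).
    axes:    (K,) split direction of each parent (0, 1, or 2).
    W[d]:    (2 * m_child, m) maps a parent to both children's features.
    S[d]:    (m_child, m_child) transforms the skipped encoder features.
    """
    K, m_child = len(axes), skips.shape[1]
    children = np.empty((2 * K, m_child))
    for k in range(K):
        d = axes[k]
        both = W[d] @ parents[k] + b[d]  # parent -> two child feature blocks
        for j in range(2):
            child = both[j * m_child:(j + 1) * m_child]
            skip = S[d] @ skips[2 * k + j] + c[d]
            children[2 * k + j] = np.maximum(child + skip, 0.0)  # ReLU
    return children
```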


2.6. Properties

  • Layerwise Parameter Sharing
    • CNN. shares convolution kernels across spatial locations
    • Kd-Net. shares the affine parameters across nodes with the same split direction at the same level (akin to 1x1 convolutions)
  • Hierarchical Representation
  • Partial Invariance to Jitter
    • the output is stable under jitter that does not change the split structure of the kd-tree
  • Non-invariance to Rotation
  • Role of kd-tree Structure
    • the kd-tree determines the order in which leaf representations are combined
    • the kd-tree structure itself can be regarded as a shape descriptor


2.7. Details

  • normalize 3D coordinates into [-1, 1]^3 with the origin at the centroid
  • data augmentation. perturbing with geometric transformations; injecting randomness into the kd-tree construction via the split-direction probabilities (see below)


For randomized trees, the split direction at each node is sampled with probability that grows with the extent r_j of the node's points along axis j (a softmax with γ = 10).
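
A minimal sketch of these two preprocessing steps; normalize_cloud and sample_split_axis are hypothetical helpers, and the exact normalization inside the softmax is an assumption:

```python
import numpy as np

def normalize_cloud(points):
    """Center the cloud at its centroid and rescale it into [-1, 1]^3."""
    points = points - points.mean(axis=0)
    return points / np.abs(points).max()

def sample_split_axis(points, gamma=10.0, rng=None):
    """Sample a split axis for randomized kd-tree construction: the
    probability of axis j grows with the extent r_j along that axis."""
    rng = rng or np.random.default_rng()
    extents = points.max(axis=0) - points.min(axis=0)
    logits = gamma * extents / extents.sum()  # ASSUMPTION: normalized extents
    probs = np.exp(logits - logits.max())     # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(3, p=probs))
```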



3. Experiments


3.1. Details

  • MNIST → 2D point clouds. points at the centers of non-zero pixels (conversion sketched after this list)
  • 3D point clouds. sample mesh faces, then sample points from the chosen faces
  • self-ensembling at test time
  • Augmentation
    • TR. translations along the axes by up to ±0.1
    • AS. anisotropic rescaling
    • DT. deterministic tree
    • RT. randomized tree
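
A sketch of the MNIST conversion from the first bullet above (the binarization threshold is an assumption):

```python
import numpy as np

def image_to_point_cloud(img, threshold=0.5):
    """2D points at the centers of active pixels, scaled into [-1, 1]^2."""
    ys, xs = np.nonzero(img > threshold)
    pts = np.stack([xs + 0.5, ys + 0.5], axis=1)  # pixel centers
    h, w = img.shape
    return pts / np.array([w, h]) * 2.0 - 1.0
```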

3.2. Classification



3.3. Ablation




3.4. Shape Retrieval



  • compute descriptors for 20 rotations of each shape → pooling → FC layer

3.5. Part Segmentation

  • during training, duplicate randomly sampled points with a small added noise; this helps with rare classes
  • during testing, predict on the upsampled cloud, then map the predictions back onto the original points (a sketch follows)
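
A sketch of this train/test procedure; upsample_with_noise and map_back are hypothetical helpers, and the noise scale is an assumption:

```python
import numpy as np
from scipy.spatial import cKDTree

def upsample_with_noise(points, target_n, sigma=1e-3, rng=None):
    """Grow the cloud to target_n points by duplicating random points
    with small Gaussian noise (helps rare part classes)."""
    rng = rng or np.random.default_rng()
    idx = rng.choice(len(points), size=target_n - len(points), replace=True)
    extra = points[idx] + rng.normal(scale=sigma, size=(len(idx), points.shape[1]))
    return np.vstack([points, extra])

def map_back(original, upsampled, labels):
    """Give each original point the label of its nearest upsampled point."""
    _, nn = cKDTree(upsampled).query(original)
    return labels[nn]
```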


  • low memory footprint (< 120 MB)